(54) Maintaining end-to-end synchronization on a telecommunications connection



(57) A method and an arrangement for maintaining 
end-to-end synchronization on a telecommunications 
connection transmitting data in frames substantially in 
real time and using synchronized end-to-end encryp- 
tion, wherein at least a part of the telecommunications 
connection is a packet-switched connection (PDN), in 
which case the reproduction delay of the data to be 
transmitted can be increased by adding one or more
extra frames (72) to the frame string (75) being
transmitted, wherein the arrangement comprises means (MS,
TE) for defining, on the basis of the number of received
frames, an initialisation vector value corresponding to a
frame received at the receiving end of the
telecommunications connection and used in decrypting the frame,
and means (GW, TE) for adjusting the reproduction
delay that are arranged to mark the frame to be added to
increase the reproduction delay as an extra frame, and 
the means (MS, TE) for defining the initialisation vector 
value are arranged to count only the frames not marked 
as extra frames in the number of received frames. 
Description

BACKGROUND OF THE INVENTION 

[0001] The invention relates to a method and apparatus
for maintaining an end-to-end synchronization on a
telecommunications connection. 
[0002] In telecommunications systems, such as an of- 
ficial network, it is very important that electronic inter- 
ception of the traffic is not possible. The air interface is 
typically encrypted, so even though the radio traffic is 
monitored, an outsider cannot decrypt it. Inside the
infrastructure, the traffic is, however, not necessarily
encrypted, so the traffic, such as speech, can be decoded
using the codec of the system in question. Even though
an outsider cannot in principle listen to the speech flow
inside the infrastructure, this is a possible security risk
for the most demanding users. Therefore, a solution has
been developed in which speech can be encrypted with
end-to-end encryption. An example of a system
enabling end-to-end encryption is the TETRA
(Terrestrial Trunked Radio) system.

[0003] The basic idea of end-to-end encryption is that 
a network user, such as an authority, can encrypt and 
decrypt traffic independently and regardless of the used 
transmission network for instance in terminal equip- 
ment. 

[0004] In the TETRA system, for instance, when em- 
ploying end-to-end encryption, the sender first codes a 
60-ms voice sample using a TETRA codec, thus
creating a plaintext sample. The transmitting terminal creates
an encrypted sample using a certain key stream
segment. The encrypted sample is then transmitted to the
network. The recipient decrypts the encrypted sample 
by using the same key stream segment, thus again ob- 
taining a plaintext sample. 

[0005] To prevent the encryption from being broken, 
the key stream segment is changed continuously, which 
means that each frame comprising a 60-ms voice sam- 
ple is encrypted with its own key stream segment. Both 
encryption key stream generators should thus agree on 
what key stream segment to use for each frame. This 
task belongs to synchronization control. For the task, 
synchronization vectors are used that are transmitted 
between terminals by means of an in-band signal. 
[0006] The encryption key stream generator gener- 
ates a key stream segment on the basis of a certain key 
and an initialisation vector. The keys are distributed to 
each terminal participating in the encrypted call. This is 
part of the terminal settings. A new key stream segment 
is thus generated once in every 60 milliseconds. After 
each frame, the initialisation vector is changed. The sim- 
plest alternative is to increment it by one, but each en- 
cryption algorithm contains its own incrementation 
method that can be even more complex to prevent the 
breaking of the encryption. 

[0007] The task of synchronization control is to make
sure that both ends know the initialisation vector used
to encrypt each frame. For the encrypter and decrypter
to agree on the value of the initialisation vector, a
synchronization vector is transmitted at the beginning of the
speech item. In case of a group call, joining the call must
be possible even during a speech item. Therefore, the
synchronization vector is transmitted continuously, for
instance 1 to 4 times a second. In addition to the
initialisation vector, the synchronization vector contains, for
instance, a key identifier and a CRC error check so that the
terminal can verify the integrity of the synchronization
vector. The recipient thus counts the number of frames
transmitted after the synchronization vector, and the
encryption key stream generator generates a new
initialisation vector on the basis of the initialisation vector
received last and the number of frames.

[0008] A data transmission network may comprise
one or more packet-switched connections, for instance
IP (Internet Protocol) connections, in which data is
transmitted using, for instance, voice over IP technology.
RTP (Real-time Transport Protocol) is one standard
protocol for transmitting real-time data, such as sound and
video images, in an IP network. The IP network
typically causes a varying delay in packet transmission.
A varying delay is very deleterious to speech
intelligibility, for instance. To compensate for this, the
receiving end of the RTP transmission buffers incoming
packets in a jitter buffer and reproduces them at a given
reproduction time. A packet arriving before the
reproduction time participates in the reconstruction of the
original signal. A packet arriving after the reproduction
time is left unused and is rejected.
[0009] On one hand, a real-time application requires
as short an end-to-end delay as possible, and
consequently the reproduction delay should be reduced. On
the other hand, a long reproduction delay allows a long
time for the packets to arrive, and thus more packets
can be accepted. The value of the reproduction delay
should thus be adjusted continuously according to the
network conditions. Most RTP algorithms have a facility
that adjusts the reproduction delay automatically
according to the network conditions to improve sound
quality. The reproduction delay can be shifted 60 ms
forward, for instance, by having the IP gateway create a
60-ms replacement packet. In other words, an extra
frame is added to the frame flow being transmitted.
[0010] A problem with the arrangement described
above is that if synchronized end-to-end encryption
is used and an extra frame is added to the frame
flow, the frame counter at the receiving end is one
frame ahead in relation to the incoming frames, and the
key stream segment of the receiving end no longer
matches the key stream segment of the transmitting
end.

[0011] Increasing the reproduction delay in the middle
of a speech item, for instance, thus has the
consequence that end-to-end synchronization is lost and the
encrypted speech can no longer be decoded. This
continues until the transmitting end sends a new
synchronization vector to synchronize the receiving end. This
phenomenon can be prevented in such a manner that
in semi-duplex calls, for instance, the reproduction delay
is changed only after speech items. If the speech items
are long, the reproduction delay can then be changed
disadvantageously infrequently: the quality of speech
may be poor until the end of the entire speech item,
because the reproduction delay cannot be changed earlier.
Further, in duplex calls, for instance, in which there are
no speech items and the terminal transmits
continuously, the reproduction delay cannot be changed at all
during the call if loss of synchronization is to be avoided.

BRIEF DESCRIPTION OF THE INVENTION 

[0012] It is thus an object of the invention to develop 
a method and an apparatus implementing the method 
so as to solve the above-mentioned problems. The ob- 
ject of the invention is achieved by a method and system 
that are characterized by what is stated in the independ- 
ent claims 1, 7, 13, and 22. Preferred embodiments of 
the invention are disclosed in the dependent claims. 
[0013] The invention is based on the idea that if the 
reproduction delay is increased during a data transmission,
such as a speech item or call, the frame added to
increase the reproduction delay is marked as an extra 
frame and only the frames not marked as extra frames 
are counted in the number of frames received at the re- 
ceiving end, in which case the extra frames added to 
increase the reproduction delay will not mix up the frame 
counter used in end-to-end encryption and there will be 
no gaps in decryption or decoding. 
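By way of illustration only, the counting rule can be sketched as follows. This is a minimal sketch and not the claimed implementation: the `Frame` type, its `extra` flag and the function `count_for_iv` are hypothetical names introduced here, and the marking of an extra frame is modelled as a plain boolean.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    payload: bytes
    extra: bool = False  # hypothetical marking set by the node that adds the frame


def count_for_iv(frames):
    """Count received frames, leaving out those marked as extra frames."""
    return sum(1 for frame in frames if not frame.extra)


# Frame string as transmitted, and the same string after one extra frame
# has been added to increase the reproduction delay.
sent = [Frame(b"f1"), Frame(b"f2"), Frame(b"f3")]
received = [sent[0], Frame(b"fill", extra=True), sent[1], sent[2]]

naive_count = len(received)            # counts the extra frame: one ahead of the sender
marked_count = count_for_iv(received)  # ignores the extra frame: matches the sender
```

With the naive count the receiving end would be one frame ahead of the transmitting end, whereas the count that skips marked frames stays equal to the number of frames actually transmitted.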
[0014] The method and system of the invention pro- 
vide the advantage that they also enable the increasing 
of the reproduction delay during data transmission with- 
out causing a disruption in the decoding of the encrypted 
data. 

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] The invention will now be described in greater 
detail by means of preferred embodiments and with ref- 
erence to the attached drawings in which 

Figure 1 shows a block diagram of the structure of 
a TETRA system, 

Figure 2 shows a block diagram of the operation of 
end-to-end encryption, 

Figure 3 shows the calculation of an initialisation 
vector by the recipient, 

Figure 4 shows a diagram of the structure of an RTP
packet,

Figure 5 shows the operation of an RTP algorithm,

Figure 6 shows a diagram of the probability of arrival
of RTP packets as a function of the transmission
time, and

Figure 7 shows a diagram of increasing the repro- 
duction delay. 



DETAILED DESCRIPTION OF THE INVENTION 

[0016] In the following, the invention will be described
by way of example in a TETRA system. The intention is,
however, not to restrict the invention to a given
telecommunications system or data transmission protocol. The
application of the invention to other systems is apparent
to a person skilled in the art.

[0017] Figure 1 shows an example of the structure of
the TETRA system. Even though the figure and the
following description refer to network elements according
to the TETRA system, this does not in any way restrict
the application of the invention to other
telecommunications systems. It should be noted that the figure only
shows the elements essential for understanding the
invention, and the structure of the system can differ from
what is stated without it having any significance to the
basic idea of the invention. It should also be noted that
an actual mobile system could comprise an arbitrary
number of each element. Mobile stations MS are
connected to TETRA base stations TBS over a radio path.
The mobile stations MS can also use a direct mode to
communicate with each other without using the base
stations TBS. Each base station TBS is connected over
a connecting line to one of the digital exchanges for
TETRA DXT of the fixed transmission network. The
TETRA exchanges DXT are connected over a
non-switched connection to other exchanges and to a
TETRA node exchange DXTc (digital central exchange for
TETRA, not shown), which is an exchange to which other
exchanges DXT and/or other node exchanges DXTc are
connected to provide alternative traffic routes. Possible
external connection interfaces to a public switched
telephone network PSTN, integrated services digital
network ISDN, private automatic branch exchange PABX
and packet data network PDN can reside in one or more
exchanges DXT. Of the above-mentioned connection
interfaces, the figure shows a connection to a packet data
network PDN through a gateway GW. The task of the
gateway GW is to convert the circuit-switched data
coming from the exchange DXT into packet-switched data
for the packet data network PDN and vice versa. This
way, terminal equipment TE connected to a
packet-switched data network PDN can communicate with the
TETRA network. The gateway GW can be a separate
network element or part of the exchange DXT, for
instance. In addition, the figure shows a dispatcher
system DS connected to the exchange DXT and made up
of a dispatcher station controller DSC and a dispatcher
workstation DWS connected to it. The administrator of
the dispatcher system controls the calls and other
functions of the mobile stations MS through the workstation
DWS.

[0018] Figure 2 illustrates the operation of end-to-end
encryption. When using end-to-end encryption, the
sender 20 first codes a 60-ms voice sample using a
TETRA codec that produces a plaintext sample (P). The
terminal creates a key stream segment KSS having the
length of P in an encryption key stream generator 21.
An encrypted sample (C) is obtained by executing a
binary XOR operation in block 22:

C = P xor KSS 

[0019] The encrypted sample is then transmitted to a
transmission network 29. A recipient 30 executes the 
same XOR operation in block 28 by using the same key 
stream segment that again produces a plaintext sample 
P: 



P = C xor KSS 

[0020] To prevent the breaking of the encryption, the 
key stream segment KSS is changed continuously and 
each frame is encrypted by its own key stream segment. 
Both encryption key stream generators 21 and 27 
should thus agree on which key stream segment to use 
for each frame. This is a task of synchronization control 
23 and 26. For the task, synchronization vectors trans- 
mitted between the terminals by means of an in-band 
signal are used. 

[0021] The encryption key stream generator (EKSG) 
21 and 27 generates the key stream segment (KSS) on 
the basis of a cipher key (CK) and an initialisation vector 
(IV). A new key stream segment is thus generated once 
for every 60 ms. 

KSS = EKSG (CK, IV) 

[0022] The initialisation vector is changed after each 
frame. The simplest alternative is to increment it by one, 
but each encryption algorithm contains its own incre- 
mentation method that can be even more complex to 
prevent the breaking of the encryption. 
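The relations C = P xor KSS, P = C xor KSS and KSS = EKSG (CK, IV) can be illustrated with a short sketch. It is given under stated assumptions only: the function `eksg` below is a stand-in built on SHA-256 and is not a TETRA encryption algorithm, the cipher key and starting initialisation vector are made-up values, and `next_iv` uses the simplest alternative of incrementing by one.

```python
import hashlib


def eksg(ck: bytes, iv: int, length: int) -> bytes:
    """Stand-in key stream generator (an assumption, not a TETRA algorithm)."""
    out = b""
    block = 0
    while len(out) < length:
        seed = ck + iv.to_bytes(8, "big") + block.to_bytes(4, "big")
        out += hashlib.sha256(seed).digest()
        block += 1
    return out[:length]


def xor(data: bytes, kss: bytes) -> bytes:
    """Binary XOR of a sample with a key stream segment of the same length."""
    return bytes(a ^ b for a, b in zip(data, kss))


def next_iv(iv: int) -> int:
    """Simplest alternative: increment the initialisation vector by one."""
    return iv + 1


ck = b"example cipher key"  # hypothetical key
start_iv = 1000             # hypothetical starting initialisation vector
frames = [b"frame-one", b"frame-two"]

# Sender: each frame gets its own key stream segment.
iv = start_iv
ciphertexts = []
for p in frames:
    ciphertexts.append(xor(p, eksg(ck, iv, len(p))))
    iv = next_iv(iv)

# Recipient: the same CK and the same IV sequence recover the plaintext.
iv = start_iv
recovered = []
for c in ciphertexts:
    recovered.append(xor(c, eksg(ck, iv, len(c))))
    iv = next_iv(iv)
```

Decrypting the second frame with the initialisation vector of the first frame yields unusable data, which is why both ends have to agree on the initialisation vector for each frame.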
[0023] The task of synchronization control 23 and 26 
is to make sure that both ends 20 and 30 know the ini- 
tialisation vector used to encrypt each frame. For the 
encrypter 20 and decrypter 30 to agree on the value of 
the initialisation vector, a synchronization vector (SV) is 
transmitted at the beginning of the speech item. In case 
of a group call, joining must be possible even during a 
speech item. Therefore, the synchronization vector is 
transmitted continuously approximately 1 to 4 times a 
second. In addition to the initialisation vector, the
synchronization vector contains for instance a key identifier
and CRC error check so that the terminal can verify the 
integrity of the synchronization vector. 
[0024] The recipient 30 thus counts the number (n) of
frames transmitted after the synchronization vector. The
encryption key stream generator 27 of the recipient 30
generates a new initialisation vector IV on the basis of
the initialisation vector received last and the number of
frames. The initialisation vector IV counting performed
by the recipient is illustrated in Figure 3, which shows a
frame string to be transmitted. Each frame comprises
two speech blocks P1 and P2, as shown in the figure for
one frame. In the presented string, frames 1, 6, 12 and
13 contain in their second speech block the
synchronization vector SV that indicates the number of the
initialisation vector IV.
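The counting performed by the recipient can be sketched as follows. The sketch rests on assumptions that the text does not state: the initialisation vector is modelled as an integer incremented by one per frame, a synchronization vector is taken to carry the initialisation vector of the frame in which it arrives, and the class name `SyncControl` and the starting value 500 are hypothetical.

```python
class SyncControl:
    """Tracks the initialisation vector from the last SV and the frame count n."""

    def __init__(self):
        self.last_iv = None  # initialisation vector received last
        self.n = 0           # frames counted since that synchronization vector

    def on_frame(self, sync_iv=None):
        """Return the IV for the current frame, or None if not yet synchronized."""
        if sync_iv is not None:
            self.last_iv = sync_iv
            self.n = 0
        elif self.last_iv is not None:
            self.n += 1
        else:
            return None
        return self.last_iv + self.n


def sender_iv(frame_no, first_iv=500):
    """IV used by the sender for a given frame number (increment by one)."""
    return first_iv + frame_no - 1


SV_FRAMES = {1, 6, 12, 13}  # frames carrying a synchronization vector in Figure 3

# A recipient present from the first frame of the string.
rx = SyncControl()
ivs = [rx.on_frame(sender_iv(k) if k in SV_FRAMES else None) for k in range(1, 14)]

# A recipient joining at frame 4 stays unsynchronized until frame 6.
late = SyncControl()
late_ivs = [late.on_frame(sender_iv(k) if k in SV_FRAMES else None) for k in range(4, 14)]
```

The late joiner illustrates why the synchronization vector is repeated during the speech item: frames 4 and 5 cannot be decrypted, and decryption starts at frame 6.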

[0025] Both ends 20 and 30 should agree on how to
encrypt a call. The synchronization control units 23 and
26 at both ends communicate with each other by means
of U-stolen speech blocks. The transmitting terminal
utilizes one or two speech blocks inside the frame for its
own purpose. This takes place in block 24. This is
indicated to the receiving terminal by setting the first 3 control
bits appropriately inside the frame. This way, the
infrastructure 29 understands that this is terminal-to-terminal
data and, on the basis of it, transmits the data
transparently without changing it. In addition, the receiving
terminal detects that there is no speech data in the
speech block in question and does not forward it to
the codec, but processes it appropriately (in other
words, the synchronization control data is filtered to the
synchronization control 26 in block 25) and generates a
replacement sound to replace the stolen speech.
Stealing a speech block destroys 30 ms of speech. This
would cause a break in speech, thus reducing its quality
and making it more difficult to understand. To avoid this,
the TETRA codec contains a replacement mechanism.
In reality, a user does not experience the missing
speech as inconvenient unless speech blocks are
stolen more than 4 times a second. The cipher keys CK are
distributed to each terminal taking part in the encrypted
call. This is part of the settings of the terminals.
[0026] The packet-switched data network PDN 
shown in Figure 1 can for instance be the Internet that 
uses TCP/IP protocols. TCP/IP is the name of a family 
of data transmission protocols used in a local area net- 
work or between local area networks. The protocols are 
IP (Internet Protocol), TCP (Transmission Control
Protocol), and UDP (User Datagram Protocol). The family
also contains other protocols intended for certain serv- 
ices, such as file transfer, e-mail, remote operation, etc. 
[0027] TCP/IP protocols are divided into layers: data 
link layer, network layer, transport layer and application 
layer. The data link layer is responsible for the physical 
connection of a terminal to the network. It is mainly as- 
sociated with the network interface card and driver. The 
network layer is often called the Internet or IP layer. This 
layer is responsible for transmitting packets inside the 
network and for instance for the routing from one device 
to another on the basis of an IP address. IP provides the 
network layer in the TCP/IP protocol family. The
transport layer provides a data flow service between two
terminals for the application layer and directs the flows into
the correct application in the terminal. The Internet
protocol family has two transport protocols: TCP and UDP. A
second task of the transport layer is to direct packets to the
correct applications on the basis of port numbers. TCP
provides a reliable data flow from one terminal to
another. TCP chops data into suitable packets, acknowledges
received packets and monitors that transmitted packets 
are acknowledged as received by the other end. TCP is 
responsible for a reliable transfer from end to end, i.e. 
the application need not take care of it. UDP, on the other 
hand, is a much simpler protocol. UDP is not responsible 
for the arrival of data, and if this is required, the appli- 
cation layer must take care of it. The application layer is 
responsible for the data processing of each application. 
[0028] RTP is a standard Internet protocol for
transferring real-time data, such as sound and video images.
It can be used for media-on-demand services or interactive
services, such as IP calls. RTP is made up of a media
part and a control part. The latter is called RTCP (RTP
Control Protocol). RTP's media part contains
support for real-time applications. This includes timing
support, loss detection, security support and content
identification. RTCP enables real-time conferences within
groups of different sizes and the evaluation of the
end-to-end service quality. It also supports the
synchronization of several media flows. RTP is designed to be
independent of the transmission network, but in the Internet,
RTP generally runs over UDP/IP. The RTP protocol has
many features that enable real-time end-to-end data
transmission. At each end, an audio application
regularly transmits small samples of audio data that can be
30 ms long, for instance. An RTP header is attached to
each sample. The RTP header and the data are packed
in a UDP and IP packet.

[0029] The content of a packet is identified in the RTP 
header. The value of this field indicates which coding 
method is used (PCM, ADPCM, LPC, etc.) in the pay- 
load of the RTP packet. In the Internet, as in other packet 
networks, packets can arrive in an arbitrary order, be 
delayed for a varying time, or even disappear
completely. To cope with this, each packet in a certain flow is given
its own sequence number and time stamp, on the basis
of which the receiving end can restore the received flow
to the order of the original flow. The sequence number is increased by
one for each packet. By means of the sequence number, 
the recipient is able to detect a missing packet and also 
evaluate packet loss. 

[0030] The time stamp is a 32-bit number. It indicates
the starting moment of sampling. To calculate it, a clock
increasing monotonically and linearly with time is used.
The frequency of the clock should be selected in such
a manner that it is suitable for the content and fast enough
for calculating jitter and enabling synchronization. For
instance, when using PCM A-law coding, the clock
frequency is 8000 Hz. When transmitting RTP packets
with a 240-byte payload, which corresponds to 240 PCM
samples, the time stamp is increased by 240 for each
packet. The length of an RTP header is 3 to 18 words
(32-bit words). Figure 4 illustrates the form of an RTP
packet. The meanings of the fields are as follows. V =
version, the RTP version used, currently 2. Filling = the
packet includes filling (padding) bits; the last byte of the
filling indicates how many. Extension = exactly one header
extension follows the fixed header. PM = the number of
service sources, which indicates the number of data
sources in the packet. A marker can be used to indicate
significant events, such as frame borders. HT = the type
of payload, which indicates the type of media in the
payload. The serial number is increased by one for each
transmitted data packet. It helps detect packet loss and
disorder. The initial value is random. The time stamp
indicates the sampling moment of the first byte. It is used
for synchronization and jitter calculation. The initial value
is random. SSRC = a randomly selected identifier of the
synchronization source. It indicates the joining point of
sources or the original sender, if there is only one source.
The CSRC list is the list of sources in this packet.
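As an illustration of the header fields and of the time stamp arithmetic, the following sketch packs the 12-byte fixed part of an RTP header (3 words, with no CSRC entries) and advances the time stamp by 240 per packet. The bit layout follows the published RTP header format; the function name `rtp_header` and the sequence number, time stamp and SSRC values are chosen here purely for illustration.

```python
import struct

CLOCK_HZ = 8000  # clock frequency used with PCM A-law coding
PACKET_MS = 30   # duration of the audio carried in one packet
SAMPLES_PER_PACKET = CLOCK_HZ * PACKET_MS // 1000  # time stamp step per packet


def rtp_header(seq, timestamp, ssrc, payload_type=8, marker=False):
    """Pack the 12-byte fixed RTP header: version 2, no padding, no extension, no CSRC."""
    byte0 = 2 << 6  # version in the two top bits; P, X and CC all zero
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


# Three consecutive packets; the initial values would normally be random.
headers = [
    rtp_header(seq=100 + i, timestamp=5000 + SAMPLES_PER_PACKET * i, ssrc=0x1234ABCD)
    for i in range(3)
]
```

Each header is 12 bytes long, and the time stamp of the third packet is 480 units later than that of the first.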

[0031] The Internet causes a varying delay in the
transfer of audio packets. For speech intelligibility, a
varying delay is very deleterious. To compensate for this,
the receiving end of RTP buffers incoming packets in a
jitter buffer and reproduces them at a given reproduction
time. A packet arriving before the reproduction time
participates in the reconstruction of the original signal. A
packet arriving after the reproduction time is left
unused and is rejected.
[0032] Figure 5 illustrates the operation of an RTP
algorithm. In the figure, the letter t refers to the
transmission time of the packet, the letter a to the reception time
and p to the reproduction time. Superscripts indicate the
number of the packet and subscripts the number of the
speech item. In the Kth speech item, the packets arrive
at the receiving end after a varying transmission time.
The RTP algorithm then reproduces them at the correct
moment. In the (K+1)th speech item, packets 1 and 2
change their order and packet 4 arrives after its
reproduction time, and is thus rejected. The RTP algorithm
returns the packets to the correct order, reproduces
them at the correct moment and indicates, for corrective
action, for instance, which packets are missing or
late. The reproduction delay is the time t(reproduction
delay) = t(reproduction) - t(transmission). The RTP
algorithm makes sure that the reproduction delay remains
constant during the entire speech item.
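The behaviour described for the (K+1)th speech item can be sketched as follows. The function `play_out`, the tuple layout and all time values are assumptions made for illustration only; the sketch models each packet by its sequence number, transmission time and reception time in milliseconds and applies a constant reproduction delay.

```python
def play_out(packets, reproduction_delay):
    """Return (sequence numbers in reproduction order, sequence numbers rejected as late).

    Each packet is a tuple (seq, t_transmission, t_reception), in ms.
    """
    accepted, late = [], []
    for seq, t_sent, t_arrival in packets:
        t_reproduction = t_sent + reproduction_delay
        if t_arrival <= t_reproduction:
            accepted.append((t_reproduction, seq))
        else:
            late.append(seq)
    accepted.sort()  # restore the original order via the reproduction times
    return [seq for _, seq in accepted], late


# Packets listed in order of arrival: 1 and 2 have changed places, 4 is late.
arrivals = [
    (2, 60, 95),
    (1, 0, 100),
    (3, 120, 150),
    (4, 180, 320),
]
played, rejected = play_out(arrivals, reproduction_delay=120)
```

With a reproduction delay of 120 ms, packets 1 to 3 are reproduced in their original order and packet 4, which arrives 20 ms after its reproduction time, is rejected.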
[0033] The delay of the IP packet through the IP
network, t = t(input) - t(output), is made up of two factors. L
is a fixed delay that depends on the transmission time
and the average queue time. J is a varying delay that
depends on a varying queue time inside the IP network
and causes jitter. The receiving end of the IP network
has a jitter buffer that stores the packets in its memory
if the transmission time t < t(reproduction delay).
Determining the reproduction delay is a compromise.
On one hand, a real-time application requires as
short an end-to-end delay as possible, and consequently
the reproduction delay should be reduced. On the other
hand, a long reproduction delay allows a long time for
the packets to arrive, and thus more packets can be
accepted. The value of the reproduction delay should thus
be adjusted continuously according to the network
conditions. Figure 6 illustrates this. A packet having a
transmission time t < L + J can be accepted, whereas a packet
having a transmission time t > L + J is rejected. By
increasing J, it is thus possible to increase the number of
accepted packets. The reproduction delay can be
adjusted, for instance, by starting with a small value and
increasing it regularly until the proportion of late packets
is below a certain limit, for instance 1%.
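The adjustment rule mentioned last can be sketched as follows. It is a minimal sketch under assumptions: the function name `choose_reproduction_delay`, the 60-ms starting value and step, and the sample of measured transmission times are all hypothetical, and a real RTP algorithm would adjust the delay continuously rather than from one fixed sample.

```python
def choose_reproduction_delay(transit_times_ms, start_ms=60, step_ms=60, max_late=0.01):
    """Increase the delay from a small value until the share of late packets is below max_late."""
    if not transit_times_ms:
        raise ValueError("at least one measured transmission time is needed")
    delay = start_ms
    while True:
        late = sum(1 for t in transit_times_ms if t > delay)
        if late / len(transit_times_ms) < max_late:
            return delay
        delay += step_ms


# Hypothetical measurements: most packets take 50 ms, a few take longer.
measured = [50] * 90 + [100] * 8 + [170] * 2
delay = choose_reproduction_delay(measured)
```

For this sample, 10% of the packets are late at 60 ms and 2% at 120 ms, so the delay settles at 180 ms, where no packet is late.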
[0034] Most RTP algorithms have a facility that ad- 
justs the reproduction delay automatically according to 
the network conditions to improve sound quality. The re- 
production delay can be shifted 60 ms forward, for in- 
stance, in such a manner that a 60-ms replacement 
speech packet is created in RTP reception before the 
speech flow continues. In other words, an extra frame 
is added to the speech flow. Figure 7 shows a frame 
string 75 to which one or more extra frames 72 are add- 
ed to obtain a frame string 76 for onward transmission. 
The reproduction delay can be shifted 60 ms backward 
in such a manner that an entire speech frame is deleted
in RTP reception. 

[0035] In Figure 1, RTP transmission thus takes place
between the gateway GW and terminal equipment TE
over the packet network PDN. The task of the gateway
GW is to convert the circuit-switched speech (or other
data) coming from the exchange DXT over the PCM line
into IP speech packets and vice versa. In the TETRA
infrastructure, speech data is transmitted in frames, so
a natural RTP packet would contain one frame of
speech data. One RTP packet would then contain 60
ms of speech and it would correspond directly to the
content of one speech frame. Another possibility is to use
an RTP packet containing only half a frame of speech
data (30 ms). A half-frame packet has the following
properties as compared with a complete-frame packet:

1) When the gateway receives half-frame packets, it has
to wait for two packets to arrive before the start of an
ISI-frame transmission. This is because the control bits
(BFI, C- or U-stolen) concerning both speech blocks are
at the beginning of the frame and the gateway must define
them on the basis of the type of the half-frame packets.

2) When an RTP packet is lost, only 30 ms of speech is
missing as opposed to 60 ms. When optimising speech
quality, the length of the packet is a compromise
between two viewpoints. One extreme is a short packet,
as a result of which the number of missing packets
increases in an inversely proportional manner to the size
of the packets, and distortions then occur more often.
The other extreme is a long packet in which distortions
occur more rarely, but which has a probability of losing
an entire phoneme, and therefore, the intelligibility of
speech becomes poorer especially when the length of
the packet is over 20 ms. This limit is the
shortest length of a phoneme.

3) For bandwidth, a long packet is, however, more
efficient, since the length (36 to 40 bytes) of the headers
(Ethernet + IP + UDP + RTP) is already long in comparison
with the length of the payload (18 bytes / speech block
or 36 bytes / speech frame). The share of the headers in
a packet can be reduced by two techniques. Multiplexing
allows several speech channels to be packed in one RTP
packet, thus reducing the share of the headers. This is a
suitable solution for an exchange-to-dispatching point
connection, since this way, all group calls and an
individual call can be transmitted in one packet. A second
technique, which is suitable for serial connections, is
compression of the headers. This way, the IP/UDP/RTP
header can be shortened considerably (to 2 to 4 bytes),
thus saving bandwidth. To achieve a better sound quality,
a short RTP packet (30 ms) is therefore preferable.
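The bandwidth argument can be checked with simple arithmetic. The following sketch uses only the byte counts given above, taking the upper end (40 bytes) of the stated header range; the function name `header_share` is introduced here for illustration.

```python
def header_share(header_bytes: int, payload_bytes: int) -> float:
    """Fraction of a packet that is taken up by headers."""
    return header_bytes / (header_bytes + payload_bytes)


HEADER_BYTES = 40        # upper end of the 36 to 40 byte range
HALF_FRAME_PAYLOAD = 18  # one speech block (30 ms)
FULL_FRAME_PAYLOAD = 36  # one speech frame (60 ms)

half_frame_share = header_share(HEADER_BYTES, HALF_FRAME_PAYLOAD)  # about 0.69
full_frame_share = header_share(HEADER_BYTES, FULL_FRAME_PAYLOAD)  # about 0.53
```

With these figures, headers make up roughly two thirds of a half-frame packet and roughly half of a complete-frame packet, which is the overhead that multiplexing and header compression aim to reduce.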
[0036] Speech blocks can be stolen from a frame for
use by the network (C-stolen) or user (U-stolen). For
instance, when using end-to-end encryption, terminals
steal one speech block for their own purpose 1 to 4 times
a second for the transmission of the synchronization
vector, as described above.

[0037] The RTP standard and many IP speech
terminals support ACELP codecs, but the RTP standard does
not support the TETRA-specific ACELP. An RTP packet
with the following settings, for instance, can be used for
speech transmission: RTP version 2, no filling, no
extension, no CSRC sources, no marker, payload type 8
(same as A law), time stamp increases by 240 units for
each packet. This corresponds to the TETRA 8000-Hz
sampling clock and 30-ms sample length. The payload
contains the following data: the first three bits indicate
whether the frame error bit (BFI) is set, whether the
payload is sound or data, and whether this is a C- or
U-stolen speech block; the other first-byte bits are not
used; the next 137 bits are the actual data and correspond
to one speech block. The remaining payload bits are 0.

[0038] The above operation of the gateway GW
between a circuit-switched and a packet-switched
connection is only one possible alternative, and the operation
of the gateway GW can differ from it without having any
significance to the basic idea of the invention.
[0039] The terminal equipment TE shown in Figure 1
can be a speech terminal or data terminal, and the
invention can be applied to audio connections, video
connections, or data connections that require real-time data
transmission. The terminal equipment TE can be a
mobile station, a dispatcher workstation, a base station or
some other network element. The terminal equipment
TE is not necessarily directly connected to the packet
network PDN, but between the terminal equipment TE
and the packet network PDN, there may be a second
TETRA network, for instance. In such a case, the other
end of the packet connection PDN also has a gateway
element. There may also be another connection or
several packet connections in between. If the terminal
equipment TE is, as shown in Figure 1, connected
directly to the packet network PDN, it acts as the other
party of the RTP transmission essentially in the same
manner as described above with reference to the
gateway GW.

[0040] According to the invention, the reproduction
delay is increased in the receiving end GW or TE of the
packet connection PDN during a data transmission, for
instance a speech item or call, in such a manner that the
frame 72 to be added to increase the reproduction delay
is marked as an extra frame, and further, in the receiving
end of the telecommunications connection only the
frames not marked as extra frames are counted in the
number n of received frames so as to obtain the correct
value of the initialisation vector, as described above. As
an example, let us examine the following situation of
Figure 1 in which there is a call between the mobile station
MS and terminal equipment TE over the packet
connection PDN according to the RTP protocol. Data
transmission according to the RTP protocol then takes place
between the gateway GW and the terminal equipment TE
supporting the protocol. The gateway GW is then the
receiving end of the packet connection PDN with
respect to the traffic coming from the terminal equipment
TE. When a need is detected according to the RTP
algorithm to increase the reproduction delay, one or more
extra frames 72 are added in the gateway GW to the
received frame string 75 and the thus obtained frame
string 76 is transmitted on to the mobile station MS. The
added extra frames 72 are also marked in the gateway
GW in such a manner that the recipient, i.e. in this case
the mobile station MS, recognizes them as extra frames
and does not count them in the number n of received
frames. Thus, the encryption algorithm of the mobile
station MS keeps the correct synchronization. The terminal
equipment TE, which is the receiving end of the packet
connection PDN with respect to the traffic coming from
the mobile station MS, correspondingly marks any extra
frames 72 possibly added to increase the reproduction
delay. This way, it is possible to identify, in the frame
string to be forwarded next to decryption and
reproduction, the extra frames that are not counted in the number
n of received frames. The control of the reproduction
delay in the terminal equipment TE is thus done before the
filter block 25 in Figure 2. A frame to be added to in- 
crease the reproduction delay can be marked as extra *o 
in a manner agreed in advance. The manner of the 
marking is not significant for the basic idea of the inven- 
tion. The most important thing is that the receiving party 
of the telecommunications connection can identify the 
extra frames. The marking can be done for instance us- 45 
ing a special parameter reserved for this purpose that 
is transmitted in the C-stolen second speech block of 
the extra frame 72. Each extra frame can be marked or, 
if several extra frames are transmitted one after the oth- 
er, it is also possible to mark only the first extra frame 50 
and indicate the number of extra frames following it. 
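
The counting rule described above can be sketched as follows.
This is only an illustration under stated assumptions: the
boolean marker, the extras_following field, the function
add_delay, the class FrameCounter and the derivation of the
initialisation vector as a start value plus n are all
hypothetical stand-ins; the specification leaves the manner
of marking open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    payload: bytes
    extra: bool = False        # hypothetical marker: frame added to increase delay
    extras_following: int = 0  # used when only the first extra frame is marked


def add_delay(frames: List[Frame], position: int, count: int) -> List[Frame]:
    """Delay-adjusting end: insert `count` marked extra frames at `position`."""
    filler = [Frame(b"", extra=True) for _ in range(count)]
    return frames[:position] + filler + frames[position:]


class FrameCounter:
    """Decrypting end: counts only the frames not marked as extra."""

    def __init__(self, initial_iv: int) -> None:
        self.initial_iv = initial_iv
        self.n = 0       # number of received (non-extra) frames
        self._skip = 0   # unmarked extra frames still expected

    def receive(self, frame: Frame) -> Optional[int]:
        """Return the IV for a real frame, or None for an extra frame."""
        if self._skip > 0:
            self._skip -= 1
            return None
        if frame.extra:
            self._skip = frame.extras_following
            return None
        iv = self.initial_iv + self.n  # assumed derivation, for illustration only
        self.n += 1
        return iv


# Frame string 75 with two extra frames added becomes frame string 76.
string_75 = [Frame(b"a"), Frame(b"b"), Frame(b"c"), Frame(b"d")]
string_76 = add_delay(string_75, position=2, count=2)
counter = FrameCounter(initial_iv=100)
ivs = [counter.receive(f) for f in string_76]
print(ivs)  # [100, 101, None, None, 102, 103]
```

Because the extra frames return no initialisation vector and
do not advance n, the frames after them are decrypted with
the same values as if no delay had been added.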
[0041] It is obvious to a person skilled in the art that as
technology advances, the basic idea of the invention can be
implemented in many different ways. The invention and its
embodiments are thus not restricted to the examples
described above, but can vary within the scope of the
claims.



Claims 

1. A method for maintaining end-to-end synchronization on a
telecommunications connection transmitting data in frames
substantially in real time and using synchronized
end-to-end encryption, wherein an initialisation vector
value corresponding to a received frame and used in
decrypting the frame is defined on the basis of the number
of frames received at the receiving end of the
telecommunications connection, and wherein at least a part
of the telecommunications connection is a packet-switched
connection, in which case the reproduction delay of the
data to be transmitted can be increased by adding one or
more extra frames to the frame string being transmitted,
characterized by

marking a frame to be added to increase the
reproduction delay as an extra frame, and
counting only the frames not marked as extra
frames in the number of received frames.

2. A method as claimed in claim 1, characterized in
that

the reproduction delay is increased in the re- 
ceiving end of the packet-switched connection. 

3. A method as claimed in claim 1 or 2, characterized 
in that 

the packet-switched connection uses an In- 
ternet protocol. 

4. A method as claimed in claim 1, 2 or 3, characterized
in that the telecommunications connection belongs to the
TETRA system.

5. A method as claimed in any one of claims 1 to 4, 
characterized in that the extra frame added to in- 
crease the reproduction delay comprises a stolen 
speech block, and said marking is done in the stolen 
speech block. 

6. A method as claimed in any one of claims 1 to 5, 
characterized in that the encryption is done using 
a key stream segment generated using the initiali- 
sation vector. 

7. An arrangement for maintaining end-to-end synchronization
on a telecommunications connection transmitting data in
frames substantially in real time and using end-to-end
encryption, wherein at least a part of the
telecommunications connection is a packet-switched
connection (PDN), in which case the reproduction delay of
the data to be transmitted can be increased by adding one or
more extra frames (72) to the frame string (75) being
transmitted, the arrangement comprising

means (MS, TE) for defining on the basis of the
number of received frames an initialisation vector
value corresponding to a frame received at the
receiving end of the telecommunications connection
and used in decrypting the frame,
characterized in that the arrangement also
comprises

means (GW, TE) for adjusting the reproduction
delay that are arranged to mark the frame to be
added to increase the reproduction delay as an
extra frame, whereby the means (MS, TE) for
defining the initialisation vector value are
arranged to count only the frames not marked as
extra frames in the number of received frames.

8. An arrangement as claimed in claim 7, characterized
in that the means (GW, TE) for adjusting the
reproduction delay reside in the receiving end of the
packet-switched connection (PDN).

9. An arrangement as claimed in claim 7 or 8,
characterized in that the packet-switched connection
(PDN) uses an Internet protocol.

10. An arrangement as claimed in claim 7, 8 or 9,
characterized in that the telecommunications connection
belongs to the TETRA system.

11. An arrangement as claimed in any one of claims 7
to 10, characterized in that the extra frame (72)
added to increase the reproduction delay comprises a
stolen speech block, and the means (GW, TE) for adjusting
the reproduction delay are arranged to do said marking in
the stolen speech block.

12. An arrangement as claimed in any one of claims 7
to 11, characterized in that the encryption is done
using a key stream segment generated using the
initialisation vector.

13. A network element for maintaining end-to-end
synchronization on a telecommunications connection
transmitting data in frames substantially in real time
and using end-to-end encryption, wherein an initialisation
vector value corresponding to a received frame and used in
decrypting the frame is defined on the basis of the number
of frames received at the receiving end of the
telecommunications connection, and wherein at least a part
of the telecommunications connection is a packet-switched
connection (PDN), in which case

the network element (GW, TE) is arranged to
increase when necessary the reproduction delay
of the data to be transmitted by adding one
or more extra frames (72) to the frame string
(75) being transmitted, characterized in that
the network element is also arranged to mark
the frame added to increase the reproduction
delay as an extra frame.

14. A network element as claimed in claim 13, 
characterized in that the network element resides 
in the receiving end of the packet-switched connec- 
tion (PDN). 

15. A network element as claimed in claim 13 or 14, 
characterized in that the extra frame (72) added 
to increase the reproduction delay comprises a sto- 
len speech block, and the network element is ar- 
ranged to do said marking in the stolen speech 
block. 

16. A network element as claimed in claim 13, 14 or 15,
characterized in that the packet-switched connec- 
tion (PDN) uses an Internet protocol. 

17. A network element as claimed in any one of claims 
13 to 16, characterized in that the telecommuni- 
cations connection belongs to the TETRA system. 

18. A network element as claimed in any one of claims 
13 to 17, characterized in that the encryption is 
done using a key stream segment generated using 
the initialisation vector. 

19. A network element as claimed in claim 17 or 18, 
characterized in that the network element is a 
TETRA dispatcher workstation. 

20. A network element as claimed in any one of claims 
13 to 18, characterized in that the network ele- 
ment is a base station. 

21. A network element as claimed in any one of claims 
13 to 18, characterized in that the network ele- 
ment is a media gateway. 

22. A network element that uses a telecommunications 
connection transmitting data in frames substantially 
in real time and using a synchronized end-to-end 
encryption, wherein at least a part of the telecom- 
munications connection is a packet-switched con- 
nection (PDN), in which case the reproduction delay 
of the data to be transmitted can be increased by 
adding one or more extra frames (72) to the frame 
string (75) being transmitted, and 

the network element (TE, MS) is arranged to 
define on the basis of the number of received 
frames an initialisation vector value corre- 
sponding to a received frame and used in de- 
crypting the frame, characterized in that 
the network element is also arranged, when the
frames added to increase the reproduction delay
are marked as extra frames, to count in the
number of received frames only the frames that
are not marked as extra frames.

23. A network element as claimed in claim 22, 
characterized in that the extra frame (72) added
to increase the reproduction delay comprises a sto- 
len speech block, and said marking is in the stolen 
speech block. 

24. A network element as claimed in claim 22 or 23,
characterized in that the packet-switched connec- 
tion (PDN) uses an Internet protocol. 

25. A network element as claimed in claim 22, 23 or 24, 
characterized in that the telecommunications
connection belongs to the TETRA system.

26. A network element as claimed in any one of claims 
22 to 25, characterized in that the encryption is 
done using a key stream segment generated using
the initialisation vector. 

27. A network element as claimed in claim 25 or 26, 
characterized in that the network element is a 
TETRA dispatcher workstation.

28. A network element as claimed in any one of claims 
22 to 26, characterized in that the network ele- 
ment is a base station. 

29. A network element as claimed in any one of claims 
22 to 26, characterized in that the network ele- 
ment is a mobile station. 
